An Ensemble Model using Genetic Algorithm for Feature Selection and rule mining using Apriori and FP-growth from Cancer Microarray data
Authors
Abstract
Several dimension reduction techniques for microarray data have been developed over the years to identify differentially expressed genes. However, very little consensus has been observed among the feature subsets they produce for a given dataset. To address this issue, this paper proposes an ensemble of feature selection techniques. The ensemble is a well-balanced combination of filter-based and wrapper-based feature selection methods. To further refine the output of the ensemble, a genetic algorithm (GA) is placed in the pipeline to produce a non-local, robust feature subset. An extensive computational experiment has been carried out on a prostate cancer dataset to validate the method. In addition, we have compared the performance of our method with the group genetic algorithm (GGA). Finally, the resultant feature subsets of GA, GGA, and the other constituents of the ensemble in standalone mode are used to uncover frequent patterns with two popular association rule mining algorithms, Apriori and FP-growth. The experimental results confirm that the proposed method gives stable results and is very effective in comparison to GGA for attribute clustering in the selection of relevant features.
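As a rough illustration of two stages named in the abstract, the sketch below combines the outputs of several feature selectors by simple voting and then mines frequent itemsets level by level in the Apriori style. It is not the authors' implementation: the function names `consensus_features` and `frequent_itemsets`, the voting rule, and item labels such as `g2_up` are assumptions made for this example, and the genetic algorithm refinement step is omitted.

```python
from collections import Counter
from itertools import combinations


def consensus_features(subsets, min_votes):
    """Keep features chosen by at least `min_votes` selectors (assumed voting rule)."""
    votes = Counter(f for subset in subsets for f in set(subset))
    return sorted(f for f, n in votes.items() if n >= min_votes)


def frequent_itemsets(transactions, min_support):
    """Apriori-style level-wise search; returns {itemset: support}."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {i for t in transactions for i in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    result = {}
    k = 1
    while current:
        for s in current:
            result[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
    return result
```

Here each transaction would stand for one sample, described by discretized expression states of the selected genes plus a class label. FP-growth finds the same frequent itemsets without generating candidates, so it could replace `frequent_itemsets` without changing the rest of the pipeline.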
Similar resources
The prediction of lymphedema via the combination of the selected data mining algorithms
Background: Breast cancer is the second leading cause of cancer death in women, after lung cancer. Due to the importance of predicting this disease, the use of data mining methods in medical research is more significant than before. Data mining algorithms can be a great help in preventing the development of lymphedema in patients. The aim of this study was to create a diagnosis system that can ...
Cancer Detection using Frequency Pattern Ant Colony Optimization
Over the past few decades, computerized diagnostic tools have been intended to aid experts in making sense of the welter of data. Due to improvements in biometric instrumentation and automation, it is easy to collect a lot of experimental data in molecular biology. Analysis of such data is extremely important, as it leads to knowledge discovery that can be validated by experiments. Several m...
A Novel Method for Selecting the Supplier Based on Association Rule Mining
One of the important problems in supply chain management is supplier selection. A company holds massive amounts of data from various departments, so extracting knowledge from this data is very complicated. Many researchers have addressed this problem with methods such as fuzzy set theory, goal programming, multi-objective programming, linear programming, mixed integer programming, analyti...
Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets
Objective(s): This study addresses feature selection for breast cancer diagnosis. The process uses a wrapper approach with GA-based feature selection and a PS-classifier. The results of the experiment show that the proposed model is comparable to the other models on the Wisconsin breast cancer datasets. Materials and Methods: To evaluate the effectiveness of the proposed feature selection method, we ...
New Approaches to Analyze Gasoline Rationing
In this paper, the relation among factors in the road transportation sector from March 2005 to March 2011 is analyzed. Most previous studies take an economic point of view on gasoline consumption. Here, a new approach is proposed in which different data mining techniques are used to extract meaningful relations between the aforementioned factors. The main and dependent factor is gasolin...
Publication date: 2017